RECOVERY OF ORIGINAL TEMPLATE



BACKGROUND 

Microarrays are molecular probes, such as nucleic acid molecules, arranged systematically onto
a solid, generally flat surface. Each probe site carries a reagent, such as a single-stranded
nucleic acid, whose molecular recognition of a complementary nucleic acid molecule leads to
a detectable signal, often based on fluorescence. Microarrays carrying many thousands of
probe sites can be used to monitor gene expression profiles over a large number of genes in a
single experiment on a hybridisation-based format.

The nucleic acid probes on the microarrays are generally made in two ways. A
combination of photochemistry and DNA synthesis allows base-by-base synthesis of the
probes in situ. This is the approach pioneered by Affymetrix for growing short strands of
around 25 bases. Their 'genechips' are commercially available and widely used (e.g.,
Wodicka et al., 1997, Nature Biotechnology 15:1359-1367), despite the expense of making
arrays designed for a particular experiment. Another method for preparing microarrays is to
use a robot to spot small (nL) volumes of nucleic acid sequences onto discrete areas of the
surface. Microarrays prepared in this manner have less dense features than Affymetrix arrays
but are more universal and cheaper to prepare (e.g., Schena et al., 1995, Science 270:467-470).
The main drawback of all types of standard microarrays is the complex hardware
required to achieve a spatial distribution of multiple copies of the same DNA sequence. Such
limitations are overcome by single molecule array technology, e.g., as described in
International Patent App. WO 00/06770.

In addition to hybridisation-based detection, a number of other biochemical assays
have been applied to nucleic acid microarrays, particularly in the area of genotyping. A
common assay is to use a DNA polymerase or DNA ligase to incorporate a fluorescent marker
onto the array. The enzyme incorporation allows the identity of one or more bases to be
determined based on the identity of the labelled marker. Such extension assays have been
developed by a number of companies and academic groups for typing single nucleotide
polymorphisms ("SNPs"). The ability to perform multiple cycles of extension reactions on
these platforms would be advantageous as it gives more information about the nature of the
sample under investigation.

For example, performing multiple extensions complementary to a template strand
yields information on the sequence of the template strand. During such a 'sequencing by
synthesis' reaction, a new strand, base-paired to the template nucleic acid, is built up in the 5'
to 3' direction by incorporation of individual nucleotides complementary to those nucleotides
in the template starting at its 3' end. The end result of a series of such incorporations is that
the single-stranded template nucleic acid is no longer single-stranded; instead, it is base-paired
to a synthetic complementary strand. The result is a double-stranded nucleic acid
molecule: the original template nucleic acid and its complementary strand, attached to the
solid substrate.

Once such a sequencing reaction is complete, removal of the synthetic strand
complementary to the template would permit re-use of the template nucleic acid, e.g., in
another sequencing reaction to verify the results of the first reaction. In another application,
the sequenced strand becomes available for hybridization of nucleic acid, e.g., DNA or DNA
mimics, e.g., PNA.

In contrast, the complete removal of both the template strand and its synthetic
complement would allow new template nucleic acids to be attached to the solid substrate to
form a new array.

SUMMARY OF THE INVENTION 

The invention relates to a hairpin nucleic acid, or a double-stranded nucleic acid
anchor, which allows templates to be regenerated according to the invention. In particular,
the invention features a hairpin nucleic acid or double-stranded nucleic acid anchor containing
a restriction site, preferably for a nicking endonuclease, located before or at the 3' end of the
hairpin nucleic acid. The present invention also relates to a method for regenerating a
single-stranded nucleic acid template following its conversion to a double-stranded product, e.g., as a
result of a polymerase reaction.

The invention features a hairpin nucleic acid, having the following characteristics: (a)
being self-complementary; and (b) having a first restriction site for a nicking endonuclease,
the restriction site including a recognition sequence and a cleavage site, where the recognition
sequence is situated so that the cleavage site is before, at, or beyond the 3' end of the hairpin
nucleic acid. The hairpin nucleic acid can further include one or more modifications to allow
hairpin nucleic acid attachment to a solid substrate. The hairpin nucleic acid can also further
include a second restriction site for a blunt-end endonuclease, the second restriction site
including a second recognition sequence and a second cleavage site, where the second
recognition sequence is situated so that the second cleavage site is before, at, or beyond the 3'
end of the hairpin nucleic acid.

The invention also features a method for recovering a single-stranded template nucleic
acid, the method including: (a) providing a single-stranded template nucleic acid attached to
the 5' end of a hairpin nucleic acid, where the hairpin nucleic acid is self-complementary and
has a first restriction site for a nicking endonuclease, the restriction site including a
recognition sequence and a cleavage site, where the recognition sequence is situated so that
the cleavage site is before, at, or beyond the 3' end of the hairpin nucleic acid, and where the
hairpin nucleic acid is a self-hybrid, and where a nucleic acid strand complementary to the
template nucleic acid is attached to the 3' end of the hairpin nucleic acid; (b) contacting the
hairpin nucleic acid with the nicking endonuclease, under conditions where the nicking
endonuclease cleaves before, at, or beyond the 3' end of the hairpin nucleic acid, thereby
providing a nicked hairpin-template-complement nucleic acid complex; and (c) subjecting the
nicked hairpin-template-complement nucleic acid complex to conditions whereby the nucleic
acid strand complementary to the template nucleic acid dissociates from the template nucleic
acid; thereby recovering the single-stranded template nucleic acid. The hairpin nucleic acid
can be attached to a solid substrate.

In another aspect, the invention features an addressable single molecule array,
including a hairpin nucleic acid as described above, where the hairpin nucleic acid is attached
to a solid substrate. Adjacent hairpin nucleic acids in such an array can be separated by a
distance of at least 10 nm, of at least 100 nm, or of at least 250 nm. The density of the hairpin
nucleic acids can be from 10^ to 10^ polynucleotides per cm², or from 10^ to 10^ molecules
per cm².

The invention also features a kit including a hairpin nucleic acid as described above,
and packaging components therefor. The invention also features a kit which includes an
addressable array as described above.

In another aspect, the invention features a double-stranded nucleic acid anchor, having
the following characteristics: (a) having a first end and a second end; and (b) having a first
restriction site for a nicking endonuclease, the restriction site including a recognition sequence
and a cleavage site, where the recognition sequence is situated so that the cleavage site is
located before, at, or beyond the 3' end of the first end of the double-stranded nucleic acid
anchor. The double-stranded nucleic acid anchor can be attached at its second end to a solid
substrate. The double-stranded nucleic acid anchor can further include a second restriction
site for a blunt-end endonuclease, the second restriction site including a second recognition
sequence and a second cleavage site, where the second recognition sequence is situated so that
the second cleavage site is located before, at, or beyond the 3' end of the first end of the
double-stranded nucleic acid anchor.

The invention also features a method for recovering a single-stranded template nucleic
acid, the method including: (a) providing a single-stranded template nucleic acid attached to a
double-stranded nucleic acid anchor, and where a nucleic acid strand complementary to the
template nucleic acid is attached to the double-stranded nucleic acid anchor, and where the
double-stranded nucleic acid anchor: (i) has a first end and a second end; and (ii) has a first
restriction site for a nicking endonuclease, the restriction site including a recognition sequence
and a cleavage site, where the recognition sequence is situated so that the cleavage site is before, at, or
beyond the 3' end of the first end of the double-stranded nucleic acid anchor; where the
single-stranded template nucleic acid is attached to the 5' end of the first end of the
double-stranded nucleic acid anchor, and where the nucleic acid strand complementary to the
template nucleic acid is attached to the 3' end of the first end of the double-stranded nucleic
acid anchor; (b) contacting the double-stranded nucleic acid anchor with the nicking
endonuclease, under conditions where the nicking endonuclease cleaves before, at, or beyond
the 3' end of the first end of the double-stranded nucleic acid anchor, thereby providing a
nicked anchor-template-complement nucleic acid complex; and (c) subjecting the nicked
anchor-template-complement nucleic acid complex to conditions whereby the nucleic acid
strand complementary to the template nucleic acid dissociates from the template nucleic acid;
thereby recovering the single-stranded template nucleic acid. The double-stranded nucleic
acid anchor can be attached at its second end to a solid substrate.

In another aspect, the invention features an addressable single molecule array,
including a double-stranded nucleic acid anchor as described above, where the
double-stranded nucleic acid anchor is attached to a solid substrate. Adjacent double-stranded
nucleic acid anchors in such an array can be separated by a distance of at least 10 nm, of at
least 100 nm, or of at least 250 nm. The density of the double-stranded nucleic acid anchors
can be from 10^ to 10^ polynucleotides per cm², or from 10^ to 10^ molecules per cm².

The invention also features a kit including a double-stranded nucleic acid anchor as
described above, and packaging components therefor. The invention also features a kit which
includes an addressable array as described above.

In one embodiment, "hairpin nucleic acid" means a single-stranded nucleic acid which
is capable of forming a hairpin, that is, a nucleic acid whose sequence contains a region of
internal self-complementarity enabling the formation of an intramolecular duplex or
self-hybrid. "Region of self-complementarity" refers to self-complementarity over a region of 4 to
100 base pairs. When not self-hybridized, the hairpin nucleic acid can be 8 to 200 base pairs,
preferably 10 to 30 base pairs in length. Saying that the hairpin nucleic acid is a
"self-hybrid", or that the hairpin nucleic acid has "self-hybridized", means that the hairpin nucleic
acid has been exposed to conditions that allow its regions of self-complementarity to
hybridize to each other, forming a double-stranded nucleic acid with a loop structure at one
end and an exposed 3' and 5' end at the other. It is preferable, but not required, that when
hybridized to itself, the exposed 3' and 5' ends form a blunt end.

The hairpin nucleic acid can also possess one or more moieties which allow the
hairpin nucleic acid to be attached to a solid substrate. Generally, such moieties will be
located together in the vicinity of the center of the hairpin nucleic acid, so that when the
hairpin nucleic acid has self-annealed, the moiety is located at the bend in the hairpin,
allowing the bend to be attached to a solid substrate. The hairpin can be self-hybridized
before or after attachment to the substrate.

In one embodiment, the hairpin nucleic acid is a molecular stem and loop structure
formed from the hybridisation of complementary polynucleotides. The stem comprises the
hybridized polynucleotides and the loop is the region that covalently links the two
complementary polynucleotides. Anything from a 4 to 100 base pair double-stranded
(duplex) region may be used to form the stem.

In another embodiment, the hairpin nucleic acid is a molecule which is synthesized in
a contiguous fashion but is not made up entirely of DNA; rather, the ends of the molecule
comprise DNA bases that are self-complementary and can thus form an intramolecular
duplex, while the middle of the molecule includes one or more non-nucleic acid molecules.
An example of such a hairpin nucleic acid would be Nu-Nu-Nu-Nu-Nu-LM-Nc-Nc-Nc-Nc-Nc,
where "Nu" is a particular nucleotide, "Nc" is the nucleotide complementary to Nu, and
"LM" is the linker moiety linking the two strands, e.g., hexaethylene glycol (HEG) or
polyethylene glycol (PEG). The non-nucleic acid molecule(s) can be linker moieties for
linking the two nucleic acids together (the two nucleic acid halves of the overall hairpin
nucleic acid), and can also be used to attach the overall hairpin nucleic acid to the substrate.
Alternatively, the non-nucleic acid molecule(s) can be intermediate molecules which are in
turn attached to linker moieties used for attaching the overall hairpin nucleic acid to the solid
substrate.
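
The Nu/LM/Nc layout above can be illustrated with a minimal sketch. It assumes plain DNA bases and treats the linker as a text label only; the function names and the example arm sequence are hypothetical and are not taken from the specification. Because the two arms pair antiparallel, the 3' arm is written as the reverse complement of the 5' arm.

```python
# Hypothetical helper names; the linker is represented only by a label.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_hairpin(arm_5prime, linker="HEG"):
    """Return (5' arm, linker label, 3' arm) for a linker-bridged hairpin."""
    return (arm_5prime.upper(), linker, reverse_complement(arm_5prime))


def is_self_complementary(hairpin):
    """True if the 3' arm can pair with the 5' arm along its whole length."""
    arm5, _linker, arm3 = hairpin
    return len(arm5) == len(arm3) and arm3 == reverse_complement(arm5)


# Illustrative arm sequence (an assumption, chosen so the 3' arm ends in GAGTCATGC).
hairpin = build_hairpin("GCATGACTC")
print(hairpin)                         # ('GCATGACTC', 'HEG', 'GAGTCATGC')
print(is_self_complementary(hairpin))  # True
```

The tuple keeps the linker separate from the two arms, which mirrors the description of the linker as a non-nucleic acid moiety joining two nucleic acid halves.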

In another embodiment, the hairpin nucleic acid is composed of two separate but
complementary nucleic acid strands that are hybridized together to form an intermolecular
duplex, and are then covalently linked together. The linkage can be accomplished by
chemical crosslinking of the two strands, attaching both strands to one or more intercalators or
chemical crosslinkers, etc.

By "double-stranded nucleic acid anchor", or "anchor", is meant a segment of
double-stranded nucleic acid which, like the hairpin nucleic acid described above, is designed to
contain one or more restriction sites capable of being acted on by one or more restriction
endonucleases, e.g., a nicking endonuclease. The double-stranded nucleic acid anchor will
have a first end and a second end. The first end is used for attachment of the template nucleic
acid and the strand complementary to the template nucleic acid. The second end of the
double-stranded nucleic acid anchor can possess one or more nucleotides which are modified
to allow the double-stranded nucleic acid anchor to be attached to a solid substrate. Because
the anchor is double-stranded, both the first end and the second end will each have a strand
with a 3' end, and a strand with a 5' end. The anchor can be a double-stranded
oligonucleotide bonded to the substrate, or two single-stranded oligonucleotides bonded to the
substrate and then hybridized.

Thus, the terms "hairpin," "hairpin nucleic acid," and "double-stranded nucleic acid
anchor" include cross-linked (e.g., hybridized, chemically cross-linked, etc.) duplex nucleic
acids or nucleic acid mimics (e.g., peptide nucleic acids (PNA)) which are capable of being
recognized and acted upon by endonucleases and polymerases.

The hairpin nucleic acids and double-stranded nucleic acid anchors generally exist as
molecules in solution before being attached to the solid substrate. In the case of hairpin
nucleic acids, the hairpin nucleic acid can be hybridized to itself before or after it is attached
to the substrate. In the case of double-stranded nucleic acid anchors, the two nucleic acid
strands of the anchor can be hybridized together, and the anchor then attached to the substrate,
or the individual single-stranded components of the anchor can be attached to the surface, and
then hybridized together.

The hairpin nucleic acids and double-stranded nucleic acid anchors (whether
self-hybridized or not) can be attached to the substrate in any way known in the art. Generally,
such methods involve modifying the nucleic acid such that it contains a chemical group or
biochemical or other molecule (e.g., biotin or streptavidin, etc.) that is either inherently
reactive with the substrate or can be activated to bond to the substrate. Modifications can be
made to any part of the nucleic acid, including linkers being attached to the bases, sugars,
phosphates, or at the 3' and 5' hydroxyl groups. Modification can be made at any part of the
hairpin nucleic acid or double-stranded nucleic acid anchor to achieve surface attachment.

Saying that an endonuclease cuts "before, at or beyond the 3' end" of a hairpin
nucleic acid presupposes that the "restriction site" for a given endonuclease comprises both a
"recognition sequence" and a "cleavage site". The recognition sequence is the precise
sequence of nucleotides recognized by a particular endonuclease, e.g., the recognition
sequence for nicking endonuclease N.BbvCIA is "GCTGAGG" (see Table 1). The cleavage
site for this endonuclease is within this recognition sequence, between the "C" and the "T".
The recognition sequence for N.BstNBI is "GAGTCNNNN^", where N can be any
nucleotide. The precise recognition sequence is therefore effectively "GAGTC". The
cleavage site for this endonuclease is four nucleotides 3' from the end of this recognition
sequence.
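
The relationship between recognition sequence and cleavage site can be made concrete with a small sketch. It uses only the two enzymes and sites quoted above; the table layout, the function name and the test strands are illustrative assumptions. It scans a single strand written 5'->3' and ignores the opposite strand, which is a simplification.

```python
import re

# Recognition sequence and nick offset, from the two examples in the text.
# offset = nucleotides from the START of the recognition sequence to the nick.
ENZYMES = {
    "N.BbvCIA": ("GCTGAGG", 2),  # GC^TGAGG: nick inside the recognition sequence
    "N.BstNBI": ("GAGTC", 9),    # GAGTCNNNN^: nick four nucleotides 3' of GAGTC
}


def nick_positions(strand, enzyme):
    """Return indices i such that the nick falls between strand[i-1] and strand[i]."""
    site, offset = ENZYMES[enzyme]
    strand = strand.upper()
    positions = []
    # A lookahead pattern also reports overlapping occurrences of the site.
    for match in re.finditer("(?=" + site + ")", strand):
        cut = match.start() + offset
        if cut <= len(strand):  # the strand must reach the cleavage site
            positions.append(cut)
    return positions


# Hypothetical strands for illustration.
print(nick_positions("AAGCTGAGGTT", "N.BbvCIA"))    # [4]  -> AAGC^TGAGGTT
print(nick_positions("TTGAGTCATGCAA", "N.BstNBI"))  # [11] -> TTGAGTCATGC^AA
```

Storing the offset relative to the start of the recognition sequence lets one table cover both enzymes that nick within their site and enzymes that nick downstream of it.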

There is no requirement that the restriction site be situated so that the endonuclease
cuts or nicks exactly at the 3' end of the hairpin nucleic acid. The cleavage site can lie within
the hairpin nucleic acid, lie at the very end of the hairpin nucleic acid, or lie outside of it. A
restriction site situated with the cleavage site located at the end of the hairpin nucleic acid is
shown in Fig. 1.

There exist nicking endonucleases that nick (cleave) at a position 3' of the recognition
sequence, that is, the recognition sequence and the cleavage site are separated by several (e.g.,
4-5) nucleotides. Such nicking endonucleases include N.AlwI, N.BspD6I, N.Bst9I,
N.BstNBI, N.BstSEI, where four random nucleotides separate the recognition sequence and
the cleavage site, and N.MlyI, where five random nucleotides separate the recognition
sequence and the cleavage site.

There is also no requirement that the recognition sequence be separated from the
cleavage site. As shown in Table 1, there exist nicking endonucleases that cut (cleave) within
their recognition sequence (e.g., N.BbvCIA, N.BbvCIB, N.Bpu10IA, N.Bpu10IB, N.CviPII,
N.CviQXI), similar to the action of an ordinary restriction endonuclease (i.e., an enzyme that
cleaves through both strands of a double-stranded nucleic acid).

Saying that an endonuclease cuts "before" the 3' end of a hairpin nucleic acid
means that the cleavage site for a particular endonuclease occurs before the 3' end of the
hairpin nucleic acid, and that nucleotides will be removed from the 3' end of the hairpin
nucleic acid. For instance, in the case of endonuclease N.BbvCIA, the placement of the
recognition sequence for this endonuclease within a hairpin nucleic acid means that this
endonuclease will, by definition, cleave at a point before the 3' end of the hairpin nucleic
acid.

Saying that an endonuclease cuts "at" the 3' end of a hairpin nucleic acid means
that the cleavage site is situated so that the endonuclease cleaves at a point exactly between
the 3' end of the hairpin nucleic acid and any nucleotides or nucleic acid strand added to it.
For instance, in the case of N.BstNBI, the restriction site is "GAGTCNNNN^". A hairpin
nucleic acid that ends in the sequence ...GAGTCATGC-3' will be cut exactly at its 3' end by
N.BstNBI, thereby removing any nucleotides incorporated onto the end of the hairpin.

Saying that an endonuclease cuts "beyond" the 3' end of a hairpin nucleic acid
means that the endonuclease cleaves at a point beyond the 3' end of the
hairpin, between nucleotides that have been added to the hairpin. For instance, if a hairpin
nucleic acid ends in the sequence ...GAGTC-3', and has a strand attached to it that begins
with 5'-AATTGGCC..., then the endonuclease N.BstNBI will cut between T and G of the
attached strand, that is, at GAGTCAATT^GGCC.
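
The three cases ("before", "at" and "beyond") can be checked with a short sketch. It assumes an N.BstNBI-type enzyme (recognition sequence GAGTC, nick four nucleotides downstream) as the default; the function name `classify_nick` and the third example sequence are hypothetical. Only the 3' portion of the hairpin and the strand attached to it are modelled.

```python
def classify_nick(hairpin_3prime_arm, attached_strand, site="GAGTC", spacer=4):
    """Classify the nick relative to the hairpin's 3' end.

    Returns ("before" | "at" | "beyond", sequence with '^' marking the nick).
    """
    hairpin = hairpin_3prime_arm.upper()
    contiguous = hairpin + attached_strand.upper()
    start = hairpin.rfind(site)
    if start == -1:
        raise ValueError("recognition sequence not found in the hairpin")
    cut = start + len(site) + spacer  # index of the first base after the nick
    if cut > len(contiguous):
        raise ValueError("strand is too short to contain the cleavage site")
    end = len(hairpin)  # position of the hairpin's 3' end
    if cut < end:
        where = "before"
    elif cut == end:
        where = "at"
    else:
        where = "beyond"
    return where, contiguous[:cut] + "^" + contiguous[cut:]


# The two worked examples from the text, plus one hypothetical "before" case.
print(classify_nick("GAGTCATGC", "AATTGGCC"))  # ('at', 'GAGTCATGC^AATTGGCC')
print(classify_nick("GAGTC", "AATTGGCC"))      # ('beyond', 'GAGTCAATT^GGCC')
print(classify_nick("GAGTCATGCAT", "AATT"))    # ('before', 'GAGTCATGC^ATAATT')
```

Passing `spacer=5` would model an enzyme such as N.MlyI, for which five nucleotides separate the recognition sequence from the cleavage site; the recognition sequence for that enzyme would have to be supplied by the user.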

If the recognition sequence in the hairpin nucleic acid is that of a nicking
endonuclease that cleaves within its recognition sequence, the inclusion of such a recognition
sequence in a hairpin nucleic acid will result in the removal of several nucleotides (i.e., two in
the case of N.CviPII, N.CviQXI; five in the case of N.BbvCIA, N.BbvCIB, N.Bpu10IA,
N.Bpu10IB) from the 3' end of the hairpin. Depending on the intended use of the hairpin
nucleic acid, such a loss may be acceptable, as after removal of the complementary strand, the
limited number of nucleotides removed from the hairpin nucleic acid can be added back by
using the same reaction as that used to build up the complementary strand in the first place.

Some enzymes may not be useful for all applications. For instance, N.CviPII and
N.CviQXI have very short recognition sequences (C^CD and R^AG, respectively), and so nick
frequently, and may therefore nick the template itself. If the template is short, and does not
contain these sequences, then these enzymes may be useful.
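
How often a short site is expected to occur by chance can be estimated with a rough calculation. The sketch below assumes uniformly random, independent bases, counts one strand only, and reads the two short sites as CCD and RAG using the standard IUPAC codes (D = A, G or T; R = A or G); the 30-nucleotide template length is an illustrative choice from the preferred range, and the function names are hypothetical.

```python
from fractions import Fraction

# IUPAC nucleotide codes needed for the sites discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "D": "AGT", "N": "ACGT"}


def site_probability(site):
    """Probability that a random position matches `site` (uniform, independent bases)."""
    p = Fraction(1)
    for symbol in site:
        p *= Fraction(len(IUPAC[symbol]), 4)
    return p


def expected_sites(site, template_length):
    """Expected number of occurrences of `site` on one strand of a random template."""
    windows = max(template_length - len(site) + 1, 0)
    return windows * site_probability(site)


for name, site in [("N.CviPII", "CCD"), ("N.CviQXI", "RAG"), ("N.BstNBI", "GAGTC")]:
    print(name, site, float(expected_sites(site, 30)))
# N.CviPII CCD 1.3125
# N.CviQXI RAG 0.875
# N.BstNBI GAGTC 0.025390625
```

Under these assumptions a 30-nucleotide template is expected to contain roughly one chance site for each of the three-base enzymes, against about one in forty for a five-base site.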

There is no requirement that the restriction site be situated so that the endonuclease
cuts or nicks exactly at the 3' end of the first end of the double-stranded nucleic acid anchor.
The endonuclease can cut or nick just before the 3' end, if it is not necessary that perfect
integrity of the double-stranded nucleic acid anchor be maintained. The endonuclease can
also cut or nick beyond the 3' end of the double-stranded nucleic acid anchor, if it is not
detrimental that nucleotides be effectively added to the anchor.

If the recognition sequence in the hairpin nucleic acid is that of a nicking
endonuclease that cleaves beyond the recognition sequence, the inclusion of such a
recognition sequence in a hairpin nucleic acid will result in nicking of the strand at a location
a few nucleotides beyond the recognition sequence. If the recognition sequence is located at
the 3' end of the hairpin nucleic acid, then cleavage will occur 4-5 nucleotides beyond the end
of the hairpin nucleic acid. If, however, the 3' end of the recognition sequence for any of
N.AlwI, N.BspD6I, N.Bst9I, N.BstNBI and N.BstSEI is located four nucleotides from the end
of the hairpin nucleic acid, then these enzymes will cut exactly at the end of the hairpin
nucleic acid. If, however, the 3' end of the recognition sequence for any of these enzymes is
located more than four nucleotides from the 3' end of the hairpin nucleic acid, then the
nicking endonuclease will nick before the 3' end of the hairpin.

The endonuclease can cut or nick just before the 3' end of the hairpin, if it is not
necessary that perfect integrity of the hairpin be maintained. The endonuclease can also cut
or nick beyond the 3' end of the hairpin nucleic acid, if it is not detrimental that nucleotides
be effectively added to the hairpin.

According to the invention, a hairpin nucleic acid is designed so that the restriction
site for a nicking endonuclease is located so that the endonuclease will nick at a location
before, at, or beyond the 3' end of the hairpin. The hairpin is then self-annealed and a
single-stranded template nucleic acid is attached to the 5' end of the hairpin. After a sequencing or
other reaction builds a synthetic strand complementary to the template nucleic acid, the
synthetic complementary strand can be removed by (1) nicking with the nicking endonuclease
that recognizes the restriction site within the hairpin, so that a nick is made at a point before,
at or beyond the 3' end of the hairpin, effectively "disconnecting" the synthetic
complementary strand from the hairpin, so that the two are no longer contiguous, and (2)
washing away the synthetic complementary strand, by standard denaturation, e.g., heat,
formamide, NaOH, etc.
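
The extend, nick and wash cycle can be modelled as simple string operations. This is a toy sketch under stated assumptions, not a description of the actual chemistry: the template and hairpin sequences are hypothetical, the loop is written as four T bases for convenience, and the nick is assumed to fall exactly at the 3' end of the hairpin (the "at" case).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extend(template, hairpin):
    """Polymerase step: grow the hairpin's 3' end along the template.

    The immobilised molecule reads 5'-template-hairpin-3'; the synthetic strand
    is the reverse complement of the template, added at the hairpin's 3' end.
    """
    return template + hairpin + reverse_complement(template)


def nick_at_hairpin_end(molecule, template, hairpin):
    """Nicking step: break the backbone exactly at the hairpin's 3' end."""
    cut = len(template) + len(hairpin)
    return molecule[:cut], molecule[cut:]  # (anchored strand, nicked complement)


def denature(anchored, complement):
    """Wash step: the nicked complement is no longer covalently attached and is lost."""
    return anchored


# Hypothetical sequences; the 3' arm ends in GAGTCATGC, as in the "at" example.
template = "ACCGTTAG"
hairpin = "GCATGACTC" + "TTTT" + "GAGTCATGC"

extended = extend(template, hairpin)
anchored, released = nick_at_hairpin_end(extended, template, hairpin)
recovered = denature(anchored, released)

print(released)                         # CTAACGGT
print(recovered == template + hairpin)  # True
```

After the wash step the anchored strand is identical to the starting template-hairpin molecule, which is what allows the template to be re-used in a further reaction.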

Practice of the method of the invention with a double-stranded nucleic acid anchor is
very similar to using a hairpin nucleic acid. The present application largely discusses use of
hairpin nucleic acids in the invention; however, one of ordinary skill will readily understand
that the double-stranded nucleic acid anchors can perform all of the same functions, and
possess the same advantages over previous methods, as the hairpin nucleic acids.

It is to be understood that in stating that the cut made by the endonuclease is "before,
at, or beyond" the 3' end of the hairpin, it is meant that the cut is made in the vicinity of the 3'
end of the hairpin, and that the recognition sequence for the endonuclease is not located at the
5' end of the hairpin nucleic acid resulting in cleavage within the 5' half of the hairpin nucleic
acid. It is also understood that by saying that the cut may be made "beyond" the 3' end of the
hairpin nucleic acid, the distance beyond the 3' end is constrained by the distance between the
recognition sequence and cleavage site for the given endonuclease. For instance, of the
nicking endonucleases in Table 1, none nicks at a point farther than five nucleotides from the
recognition sequence. Therefore, no cleavage will occur farther than five nucleotides beyond
the 3' end of the hairpin nucleic acid, unless endonucleases are used which have
cleavage sites that are further removed from their recognition sequences.

The hairpin nucleic acid or the double-stranded nucleic acid anchor can be attached to
a substrate, e.g., in a spatially-addressable array.

"Template nucleic acid," or "single-stranded template nucleic acid," as used herein,
means a linear single-stranded nucleic acid molecule which, when attached to the
self-annealed hairpin nucleic acid (or anchor) described herein, is capable of being recognized and
acted upon by a polymerase such that, under the proper conditions, the polymerase
incorporates nucleotides onto the 3' end of the hairpin nucleic acid, where each nucleotide is
complementary to the corresponding nucleotide on the template nucleic acid, thereby
extending the 3' end of the hairpin and producing a nucleic acid strand complementary to the
template nucleic acid. The term also includes a double-stranded nucleic acid that is attached
to the hairpin, where one strand is then removed, leaving a single strand. The term can also
include the ligation and covalent attachment of both strands of a double-stranded nucleic acid
to the hairpin nucleic acid or double-stranded nucleic acid anchor, followed by nicking
according to the methods described herein followed by washing to remove the nicked strand;
that is, the method of the invention can itself be used in the attachment of the template nucleic
acid to the hairpin nucleic acid or the double-stranded nucleic acid anchor. Alternatively, one
strand of a double-stranded nucleic acid can be ligated to the hairpin nucleic acid or
double-stranded nucleic acid anchor, and the second strand washed away.

The template can be any length that can be successfully sequenced, preferably 10 to
100 nucleotides, more preferably 15 to 100 nucleotides, most preferably 20 to 30 nucleotides.
Although the term "template nucleic acid" is used herein, it will be appreciated by one of
ordinary skill that the invention is not limited to sequencing reactions, but that the techniques
can be used to assay the interaction of the "templates" with other molecules. Such
embodiments are described below.

By stating that the template is "attached" to the hairpin or anchor is meant that the
template nucleic acid is covalently attached.

By stating that the polymerase will act upon the template and incorporate nucleotides
onto the 3' end of the hairpin is meant that the polymerase will act given appropriate
conditions, such as appropriate temperature, buffers, pH, nucleotides, and other reaction
components and conditions required for action by the polymerase.

By "nucleic acid strand complementary to the template nucleic acid", or "synthetic
nucleic acid strand complementary to the template nucleic acid", or more simply,
"complement", is meant a strand of nucleic acid which possesses a sequence that is
complementary to that of the template nucleic acid, that is, the complement and the template
nucleic acids can hybridize and form a stretch of double-stranded nucleic acid.

By stating that the template or complement is "attached" to the hairpin or anchor is
meant that the template nucleic acid or its complement is covalently attached.

As used herein, the term "array" refers to a population of hairpin nucleic acids or
double-stranded nucleic acid anchors that are distributed over a solid support. The nucleic
acids can be distributed in a single molecule array, that is, the nucleic acids are spaced at a
distance from one another sufficient to permit their individual resolution. Alternatively,
nucleic acids of one type can be clustered at a single address, when one or more nucleic acids
at the address can be detected.

"Solid support", as used herein, refers to the material to which the hairpins and/or
anchors are attached. Suitable solid supports are available commercially, and will be apparent
to the skilled person. The supports can be manufactured from materials such as glass,
ceramics, silica and silicon. Supports with a gold surface may also be used. The supports
usually comprise a flat (planar) surface, or at least a structure in which the molecules to be
interrogated are in approximately the same plane. Alternatively, the solid support can be
non-planar, e.g., a microbead. Any suitable size may be used. For example, the supports might be
on the order of 1-10 cm in each direction.

In one aspect of the invention, the "array" is a device comprising a "single molecule
array," that is, a plurality of the hairpins and/or anchors of the invention, i.e., the hairpin
and/or anchor molecules, are immobilized on the surface of a solid support, such that the
molecules are at a density that permits individual resolution of at least two of the molecules
and their attached templates. "Plurality" is used to mean that multiple molecules are placed
on the array. The molecules can be of all the same type, or of multiple, i.e., different, types,
i.e., the array can be composed entirely of hairpins, or entirely of anchors, or of a mixture of
the two. In general, the hairpins/anchors are at a density of 10^ to 10^ individually resolvable
polynucleotides per cm², preferably 10^ to 10^ individually resolvable polynucleotides per
cm².

In another aspect of the invention, the "array" is a device comprising a high-density
array, that is, where each individual address on the array comprises a cluster of nucleotides of
the same type, while another address on the array comprises a cluster of nucleotides of a
different type. Detection of an address is done by detecting one or more individual
nucleotides at the address.

As used herein, the term "interrogate" means contacting one or more of the hairpins
and/or anchors with another molecule, e.g., a polymerase, a nucleoside triphosphate, a
complementary nucleic acid sequence, wherein the physical interaction provides information
regarding a characteristic of the arrayed molecule and the template nucleic acid attached to it.
The contacting can involve covalent or non-covalent interactions with the other molecule. As
used herein, "information regarding a characteristic" means information regarding the
sequence of one or more nucleotides in the template, the length of the template, the base
composition of the template, the Tm of the polynucleotide, the presence of a specific binding
site for a polypeptide or other molecule, the presence of an adduct or modified nucleotide, or
the three-dimensional structure of the template.

The term "individually resolved by optical microscopy" is used herein to indicate that,
when visualized, it is possible to distinguish at least one polynucleotide on the array from its
neighbouring polynucleotides using optical microscopy methods available in the art.
Visualisation may be effected by the use of reporter labels, e.g., fluorophores, the signal of
which is individually resolved.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a diagram illustrating an embodiment of the invention. 

Fig. 2 is a diagram illustrating the steps in sequencing a single stranded nucleic acid
template attached by a hairpin (or other anchoring sequence) to a substrate.






WO 2004/050916



PCT/GB2003/005266 



Fig. 3 is a diagram showing a hairpin containing a nicking site of the nicking
endonuclease N.BstNBI.

Fig. 4 is a diagram showing a hairpin containing a cleavage site of the blunt end
endonuclease MlyI.

Fig. 5 is a diagram showing a double-stranded nucleic acid anchor containing a
nicking site of the nicking endonuclease N.BstNBI.

DETAILED DESCRIPTION 

The present invention relates to a method for regenerating a single-stranded nucleic
acid template following its conversion to a double-stranded product, e.g., during a sequencing
reaction. The invention also relates to single-stranded templates capable of being regenerated
according to the invention. The invention also relates to the removal of a double-stranded
nucleic acid from its substrate, e.g., removal of a double stranded nucleic acid from another
molecule anchoring it to a solid substrate, or from a hairpin nucleic acid anchoring the double
stranded nucleic acid to a solid substrate.

Single-molecule sequencing allows complete genomes to be sequenced on a single
microarray chip in a single sequencing reaction. The principle of this technology is that large
numbers of short sequences from fragmented DNA are immobilized as single strands on a
surface where they can be individually visualized with a sensitive microscope and camera.
Every fragment is then sequenced simultaneously with fluorescent nucleotides and a
polymerase enzyme, and the sequence information from all of the molecules is recorded
simultaneously within a single camera frame. The method does not rely on DNA
amplification by PCR or any sub-cloning steps; instead, tiny quantities of DNA can be
directly sequenced immediately after being extracted from source. When a sequencing
reaction is complete, the single stranded template strand can be regenerated by enzymatic
cleavage of the newly synthesized sequencing strand as described herein.

For example, a hairpin nucleic acid containing a restriction site is provided, i.e., a
single-stranded nucleic acid with a region of internal complementarity (so that it is capable of
hybridizing to itself and forming a hairpin) and also containing a restriction site. The hairpin
nucleic acid has, near its 3' end, a restriction site for a nicking endonuclease. The restriction
site is situated so that the nicking endonuclease will nick at a point before, at, or beyond the 3'
end of the single-stranded nucleic acid. A nicking endonuclease acting upon such a restriction
site in such a nucleic acid is shown in Fig. 1.

To use the hairpin to recover a template nucleic acid according to the invention, a
single-stranded nucleic acid template is attached to the 5' end of the hairpin. This can be
done in a number of ways. A single-stranded nucleic acid can be attached to the hairpin.
Alternatively, a double-stranded nucleic acid can be attached to the hairpin, and either one
strand ligated to the hairpin, or both strands ligated and then one strand removed, e.g.,
according to the methods described herein. The hairpin nucleic acid is then self-annealed to
form a hairpin with an attached template nucleic acid. Alternatively, the hairpin can be
self-annealed first, with the single-stranded template nucleic acid then being attached to the
hairpin. Once the template nucleic acid is attached to the hairpin, it is in a position to be
"recovered" following a sequencing or other reaction that builds up a strand complementary to
the template nucleic acid, and attached to the 3' end of the hairpin.

During such a reaction, such as that shown in Fig. 2, single nucleotides are generally
incorporated onto the 3' end of the hairpin, where each nucleotide is complementary to the
nucleotide opposite it on the template strand. The end result of such a reaction is that the
single-stranded template nucleic acid is no longer single-stranded; instead, it is base-paired to
a synthetic complementary strand. The result is a double-stranded nucleic acid molecule: the
original template nucleic acid and its synthetic complementary strand, attached to a hairpin
nucleic acid.

The template nucleic acid can then be recovered according to the invention, that is, the
complementary strand can be removed by contacting the double-stranded nucleic acid
molecule plus hairpin with a nicking endonuclease that is capable of recognizing the
restriction site that is in the hairpin nucleic acid, near what was its original 3' end. Because
the restriction site is situated so that the nicking endonuclease will create a "nick" at a point
near, at, or beyond the original 3' end of the hairpin nucleic acid, the nick will be made
before, at, or just beyond, the junction between what was originally the 3' end of the hairpin,
and the start of the strand complementary to the template nucleic acid (see, e.g., Fig. 1).

When a nick is introduced, the sequence distal to the cleavage is no longer contiguous
with the sequence proximal to it. That is, the hairpin and the synthetic complementary strand
are no longer contiguous. Rather, the synthetic complementary strand effectively becomes a
separate, discrete single strand of nucleic acid that is hybridized to the template nucleic acid.
The synthetic complementary strand is thus amenable to being washed away by denaturing
the overall nucleic acid complex by using heat or chaotropic conditions such as high
concentrations of urea. After the synthetic strand is washed away, the template nucleic acid is
still attached to the hairpin, and is available for re-sequencing or other applications (see, e.g.,
Fig. 1).

Although the embodiment described above uses a hairpin containing a single
restriction site for a nicking endonuclease, the sequence of the hairpin can be designed to
contain multiple restriction sites, e.g., for nicking endonucleases or other types of enzymes,
such as blunt end endonucleases and/or ordinary restriction enzymes.

For instance, the hairpin can contain restriction sites for both a nicking endonuclease
and a blunt end endonuclease. With such a hairpin, one can choose either to recover the
template by selectively removing the synthetic complement, as described above, or, by use of
the blunt end endonuclease, to remove both the synthetic complement and the template,
leaving only the hairpin.

The invention discloses the use of a 'nicking' class of enzyme to regenerate the
template DNA on an arrayed surface, or a Type IIs endonuclease to regenerate a blunt hairpin.
Both of these enzymes may share a common restriction site, or may use different restriction
sites. Two of the enzymes discussed herein, N.BstNBI and MlyI, exemplify two enzymes that
share a common restriction site. In this case, the two enzymes recognize the same sequence
of nucleotides, but actually cleave at different locations. In the case of enzymes that do not
share a common restriction site, the different restriction sites can be included in the design of
the hairpin/anchor sequence.

The invention can be used to recover the original template in an array, e.g., a device
where multiple nucleic acid sequences are attached to a substrate, e.g., a device in which
fragments of nucleic acid, e.g., DNA, from a genome of interest are attached to the surface of
a glass slide by ligation to a DNA hairpin.

An advantage of the ability to regenerate a template is that a second and subsequent
round of sequencing on the same template should eliminate any random sequencing errors
that arose during the first round of sequencing. The method is therefore useful in confirming
sequencing data.
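
The benefit of re-sequencing a regenerated template can be sketched in code. The following is
only an illustration, assuming that each round yields a read of the same length and that the
rounds are reconciled by a simple per-position majority vote; the voting rule and the name
`consensus` are assumptions made for this sketch and are not part of the method described
herein.

```python
from collections import Counter


def consensus(reads):
    """Return the per-position majority base across reads of equal length."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one column of base calls per template position.
    # most_common(1) returns the most frequent call in that column; ties are
    # resolved in favour of the call seen first, which is an arbitrary choice.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Three rounds on the same template; the second round carries one random error.
rounds = ["GATTACA", "GATTTCA", "GATTACA"]
print(consensus(rounds))
```

With only two rounds a disagreement can be detected but not resolved by voting, so an odd
number of rounds is the more useful case for this kind of check.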

In general, the invention is useful in situations where a single-stranded nucleic acid
template has been made double-stranded, e.g., in a sequencing reaction, and there is then a
need to remove the complementary strand that was synthesized and attached to the template.

Such a sequencing method is illustrated in Fig. 2. The sequence of bases in a template
strand is determined by employing a polymerase enzyme to synthesize a complementary
strand on the template strand one base at a time. Fig. 2 shows a substrate with a hairpin
attached, and a template strand (with the nucleotides represented by circles and squares)
attached to one of the ends of the hairpin. Individual bases are then added, each labeled with
a different label, e.g., each with a different fluorophore. One complementary base is attached
to the end of the hairpin (or end of the growing synthetic strand) by incorporation, e.g., by a
polymerase, to the growing complementary strand. The identity of the complementary
nucleotide is then determined by detection of the fluorophore, e.g., by washing away
unincorporated labeled nucleotides and subsequent detection of the attached fluorophore. The
label is then cleaved off the recently-incorporated nucleotide, e.g., by chemical means, and a
nucleotide complementary to the next nucleotide in the template is incorporated into the
growing complementary strand, the label detected and identified, and then cleaved off.
Subsequent cycles of incorporation, detection and cleavage result in the sequencing of the
complementary strand, and perforce, the deduction of the sequence of the original template
nucleic acid. Fig. 2 shows the template attached to a hairpin, but the template could
alternatively be attached to a segment of double-stranded nucleic acid, e.g., a double-stranded
nucleic acid anchor.
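
The cycle of incorporation, detection and label cleavage described above can be sketched as a
small simulation. This is a toy model under stated assumptions: the dye names are hypothetical
placeholders, every cycle is taken to incorporate exactly one correct base, and the function
name `sequence_template` is introduced here for illustration only.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical label assignment: one distinguishable fluorophore per base.
LABEL_OF_BASE = {"A": "dye1", "C": "dye2", "G": "dye3", "T": "dye4"}
BASE_OF_LABEL = {label: base for base, label in LABEL_OF_BASE.items()}


def sequence_template(template_5to3):
    """Simulate cycles of incorporation, detection and label cleavage.

    Returns the synthesized complement (5'->3') and the template sequence
    deduced from it.
    """
    complement = []
    # The new strand grows from the 3' end of the hairpin, so the template is
    # read starting from the base next to the hairpin, i.e. from its 3' end.
    for template_base in reversed(template_5to3):
        incorporated = PAIR[template_base]        # incorporation by the polymerase
        signal = LABEL_OF_BASE[incorporated]      # detection after washing
        complement.append(BASE_OF_LABEL[signal])  # base call; label is then cleaved
    complement_5to3 = "".join(complement)
    # The template is the reverse complement of the strand that was read.
    deduced = "".join(PAIR[base] for base in reversed(complement_5to3))
    return complement_5to3, deduced


complement, deduced = sequence_template("GATTACA")
print(complement, deduced)
```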

After a series of such incorporations, the original template strand is no longer single
stranded; instead, it is base-paired to a growing synthetic complementary strand. Eventually,
the template strand may become entirely double-stranded. The invention described herein
enables reuse of the device, either by recovery and further interrogation of the sequenced
template nucleic acid through removal of the synthetic complementary strand, or by
regeneration of the blunt hairpins on the solid substrate.

In one embodiment, the hairpin nucleic acid used to attach the single-stranded
template to the solid substrate has been designed such that it contains within its sequence a
restriction site for a nicking endonuclease. A "nicking endonuclease" is one of a class of
enzymes that bind reversibly to a specific site in double-stranded nucleic acid and then cleave
a phosphodiester bond in only one strand at a short distance from the enzyme's binding site.
The result is a 'nick' in one strand of the double-stranded nucleic acid, rather than cleavage of
both strands. In general, the nick leaves a 3'-hydroxyl and a 5'-phosphate. When a nick is
produced in a section of double-stranded nucleic acid, the sequence distal to the restriction
site and cleavage site is no longer contiguous with the main body of the double-stranded
nucleic acid. It becomes, in essence, a single strand hybridized to the rest of the nucleic acid.
It can therefore be washed away by denaturing the nucleic acid using heat or by using
chaotropic conditions such as high concentrations of urea.

Several enzymes are known to nick DNA in a single strand, but most are found in
multiple protein complexes involved in DNA replication or in DNA repair, and as such, have
before now had limited applications in manipulating DNA in vitro. However, a number of
these enzymes are commercially available and can be used to nick DNA under simple
reaction conditions. For example, N.BstNBI (available from New England Biolabs, Beverly,
Massachusetts, USA) has been used to prepare substrates for studies into DNA repair
mechanisms. This and other such enzymes are shown in Table 1, below. A number are
available commercially (e.g., N.AlwI, N.BstNBI, N.BbvCIA and N.BbvCIB are available
from New England BioLabs, Inc., Beverly, Massachusetts, USA). Information on enzymes
and their cleavage sites can be found in the relevant scientific literature, and/or in public
databases, e.g., REBASE (Roberts et al., 2001, Nucl. Acids Res. 29:268-269) ("rebase/"),
which is maintained by New England Biolabs on its web site ("neb.com").



Table 1. Nicking endonucleases and their restriction sites.

Enzyme       Restriction Site (5' to 3')   Isoschizomers
N.AlwI       GGATCNNNN^
N.BbvCIA     GC^TGAGG
N.BbvCIB     CC^TCAGC
N.Bpu10IA    GC^TNAGG
N.Bpu10IB    CC^TNAGC
N.BspD6I     GAGTCNNNN^                    N.Bst9I N.BstNBI N.BstSEI N.MlyI
N.Bst9I      GAGTCNNNN^                    N.BspD6I N.BstNBI N.BstSEI N.MlyI
N.BstNBI     GAGTCNNNN^                    N.BspD6I N.Bst9I N.BstSEI N.MlyI
N.BstSEI     GAGTCNNNN^                    N.BspD6I N.Bst9I N.BstNBI N.MlyI
N.CviPII
N.CviQXI     R^AG
N.MlyI       GAGTCNNNNN^





The position of the restriction site of the nicking endonuclease can be chosen so that
the enzyme cleaves the synthetic complementary strand from the main body of the hairpin and
genomic template strand. After this detached section is washed away, the template strand
remains attached to the hairpin and is available for re-sequencing or other applications.

N.BstNBI recognizes the asymmetric sequence GAGTC (SEQ ID NO:1) in double
stranded DNA and nicks between the fourth and fifth base downstream of this sequence in the
same strand. As described herein, this restriction site has been incorporated into the 3' end of
DNA hairpins such that the N.BstNBI enzyme nicks the hairpin just upstream of the synthetic
complementary strand, thereby detaching it from the hairpin.

Such a hairpin is shown in Fig. 3. The linear sequence of the hairpin is
5'-NNNNGACTC . . . (hairpin loop) . . . GAGTCnnnn-3'. The four nucleotides represented
by "n" on the lower strand represent the synthesized nucleotides complementary to the four
template sequence nucleotides represented by "N" on the upper strand. The enzyme
N.BstNBI will nick the complementary strand at the position indicated by the arrow, thereby
releasing the lower sequence "nnnn".

The incorporation of this particular restriction site into the hairpin has an added
advantage in that it is also recognized by another endonuclease, MlyI. In contrast to
N.BstNBI, this enzyme cleaves the hairpin in both strands between the fifth and sixth base
downstream of the restriction site to produce a blunt end. Thus, the addition of this enzyme
following a sequencing reaction on a hairpin allows the original blunt hairpin to be
regenerated, as is shown in Fig. 4.

"Blunt end endonucleases" are those which hydrolyze both strands of a nucleic acid,
and do so without leaving an overhanging end. A number of blunt end endonucleases are
listed in Table 2, below.



Table 2. Blunt end endonucleases (Type II).

Enzyme     Restriction Site   Isoschizomers
           (5' to 3')
AhaIII     TTT^AAA            DraI PauAII SruI
AluI       AG^CT              MltI
BalI       TGG^CCA            MlsI Mlu31I MluNI MscI Msp20I
BfrBI      ATG^CAT
BloHII     CTGCA^G
BsaAI      YAC^GTR            BstBAI MspYI PsuAI
BsaBI      GATNN^NNATC        Bse8I BseJI Bsh1365I BsiBI BsrBRI MamI
BsrBI      CCG^CTC            AccBSI BstD102I Bst31NI MbiI
BtrI       CAC^GTC            BmgBI
Cac8I
CviJI      RG^CY              CviTI
CviRI      TG^CA              HpyCH4V HpyF44III
Eco47III   AGC^GCT            AfeI AitI Aor51HI FunI
Eco78I     GGC^GCC            EgeI EheI SfoI
EcoICRI    GAG^CTC            Ecl136II Eco53kI MxaI
EcoRV      GAT^ATC            CeqI Eco32I HjaI HpyCI NsiCI
EsaBC3I    TC^GA
FnuDII     CG^CG              AccII BceBI BepI Bpu95I Bsh1236I Bsp50I Bsp123I
                              BstFNI BstUI Bsu1532I BtkI Csp68KVI CspKVI FalII
                              FauBII MvnI ThaI
FspAI      RTGC^GCAY
HaeI       WGG^CCW
HaeIII     GG^CC              BanAI BecAII Bim19II Bme361I BseQI BshI BshFI
                              Bsp211I BspBRI BspKI BspRI BsuRI BteI CltI DsaII
                              EsaBC4I FnuDI MchAII MfoAI NgoPII NspLKI PalI
                              Pde133I PflKI PlaI SbvI SfaI SuaI
HindII     GTY^RAC            HinJCI HincII
HpaI       GTT^AAC            BstEZ359I BstHPI KspAI SsrI
Hpy8I      GTN^NAC            HpyBII
LpnI       RGC^GCY            Bme142I
MlyI       GAGTCNNNNN^        SchI
MslI       CAYNN^NNRTG        SmiMI
MstI       TGC^GCA            Acc16I AosI AviII FdiII FspI NsbI PamI Pun14627I
NaeI       GCC^GGC            CcoI PdiI SauBMKI SauHPI SauLPI SauNI SauSI
                              Slu1777I
NlaIV      GGN^NCC            AspNI BscBI BspLI PspN4I
NruI       TCG^CGA            Bsp68I MluB2I Sbo13I SpoI
NspBII     CMG^CKG            MspA1I
OliI       CACNN^NNGTG        AleI
PmaCI      CAC^GTG            AcvI BbrPI BcoAI Eco72I PmlI
PmeI       GTTT^AAAC          MssI
PshAI      GACNN^NNGTC        BoxI BstPAI
PsiI       TTA^TAA            -
PvuII      CAG^CTG            BavI BavAI BavBI Bsp153AI BspM39I BspO4I Cfr6I
                              DmaI EclI NmeRI Pae17kI Pun14627II Pvu84II
RsaI       GT^AC              AfaI HpyBI PlaAII
ScaI       AGT^ACT
SciI       CTC^GAG
SmaI       CCC^GGG            CfrJ4I PaeBI PspALI
SnaBI      TAC^GTA            BstSNI Eco105I
SrfI       GCCC^GGGC
SspI       AAT^ATT
SspD5I     GGTGANNNNNNNN^
StuI       AGG^CCT            AatI AspMI Eco147I GdiI PceI Pme55I SarI Sru30DI
                              SseBI SteI
SwaI       ATTT^AAAT          BstRZ246I BstSWI MspSWI SmiI
XcaI       GTA^TAC            BspM90I BssNAI Bst1107I BstBSI BstZ17I
XmnI       GAANN^NNTTC        Asp700I BbvAI MroXI PdmI
ZraI       GAC^GTC





It is to be understood that the enzymes used in the invention can be those discovered
in nature (i.e., naturally-occurring enzymes), or can be enzymes created by mutation of
existing enzymes.

The regeneration protocol is not restricted solely to arrays containing hairpin DNA
molecules or DNA molecules constructed on hairpins (e.g., ligated genomic DNA). Instead,
the template can be attached to a double-stranded nucleic acid "anchor" that incorporates the
restriction site(s). Such an embodiment is shown in Fig. 5 for the N.BstNBI enzyme.
The method can be used on double-stranded arrays formed by hybridization of
complementary sequences to a single-stranded array, for example, hybridization of a PCR
product generated from primers containing a restriction site for a nicking enzyme.
Furthermore, the protocol can be applied to other types of arrays besides single-molecule
arrays, i.e., arrays where multiple copies of the same DNA molecule are present at the same
locus on the chip.

The hairpin/anchor can also be designed to include one or more restriction sites for
nicking endonucleases, blunt end endonucleases, or restriction endonucleases.

For instance, the enzyme N.BstNBI recognizes the sequence 5'-GAGTC-3', and acts
by cleaving the strand between the fourth and fifth nucleotides in the 3' direction from this
sequence. This sequence can be incorporated into the hairpin:

5'-NNNNGACTC . . . GAGTCNNNN-3',

where ". . ." represents a number of nucleotides or other moieties added to form the "loop" of
the hairpin. Because a hairpin sequence cannot immediately turn upon itself, it is preferable
to add 1 to 1000 nucleotides that will form the curve of the loop between the complementary
portions of the sequence, preferably 1 to 100 nucleotides.
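
The design rule above can be checked with a short sketch. It assumes concrete bases in place
of the "N" positions, a four-base loop, and a fully extended complementary strand; the
function names `build_hairpin` and `regenerate`, the spacer "ACGT", the loop "TTTT" and the
template "GATTACA" are illustrative choices, not sequences taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE = "GAGTC"    # N.BstNBI recognition sequence
NICK_OFFSET = 4   # the nick falls four bases 3' of the recognition sequence


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def build_hairpin(spacer, loop):
    """Return 5'-rc(spacer) GACTC loop GAGTC spacer-3' as one linear strand."""
    if len(spacer) != NICK_OFFSET:
        raise ValueError("spacer must be exactly four bases long")
    three_prime_arm = SITE + spacer
    return reverse_complement(three_prime_arm) + loop + three_prime_arm


def regenerate(template, hairpin):
    """Extend, nick and denature; return (retained strand, released strand)."""
    # Template is joined to the 5' end; synthesis extends the 3' end.
    extended = template + hairpin + reverse_complement(template)
    # Index of the nick in the extended strand: four bases past the GAGTC
    # site in the 3' arm, which is the hairpin/complement junction here.
    nick = len(template) + hairpin.rindex(SITE) + len(SITE) + NICK_OFFSET
    return extended[:nick], extended[nick:]


hairpin = build_hairpin(spacer="ACGT", loop="TTTT")
retained, released = regenerate("GATTACA", hairpin)
print(hairpin)
print(retained, released)
```

Because the spacer is exactly four bases, the nick coincides with the junction, so the
retained strand is the template plus the intact hairpin and the released strand is only the
synthesized complement.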

The MlyI restriction site can be "added" to the above sequence by merely adding an
extra nucleotide:

5'-NNNNNGACTC . . . GAGTCNNNNN-3'.

This sequence would form the hairpin:

             2
 ┌CTCAGNNNN Nv-5'
 └GAGTCNNNN^N^-3'
           1 2

where, when the sequence has formed a hairpin, the arrow "1" indicates the site of the nick
made by N.BstNBI, and the arrow "2" indicates the site on each "strand" that is cut by MlyI.
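
The two recovery options offered by a hairpin carrying both sites can be compared in a short
sketch. It assumes a fully extended complementary strand, a five-base spacer after GAGTC,
and illustrative sequences; the function `recover` and the constants below are hypothetical
names introduced for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE = "GAGTC"  # recognition sequence shared by N.BstNBI and MlyI


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


# Illustrative hairpin: 5'-rc(GAGTC + spacer) + loop + GAGTC + spacer-3'
SPACER = "ACGTA"  # five bases, so that the MlyI cut falls at the hairpin end
HAIRPIN = reverse_complement(SITE + SPACER) + "TTTT" + SITE + SPACER


def recover(template, hairpin, enzyme):
    """Return the strand left on the surface after digestion and denaturation."""
    extended = template + hairpin + reverse_complement(template)
    site_end = len(template) + hairpin.rindex(SITE) + len(SITE)
    if enzyme == "N.BstNBI":
        # Single-strand nick four bases past the site: the template stays.
        return extended[: site_end + 4]
    if enzyme == "MlyI":
        # Blunt cut five bases past the site. The strand is fully base-paired
        # with itself, so the cut on the opposite side is the mirror position.
        cut = site_end + 5
        mirror = len(extended) - cut
        return extended[mirror:cut]
    raise ValueError(f"unsupported enzyme: {enzyme}")


print(recover("GATTACA", HAIRPIN, "N.BstNBI"))
print(recover("GATTACA", HAIRPIN, "MlyI"))
```

In this model the nick lies one base short of the hairpin's 3' end, so the nicked product
retains the template but lacks the last hairpin base, whereas the blunt cut returns the
hairpin alone.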

One can also make use of enzymes that do not recognize the same site. For instance,
the blunt end endonuclease SspD5I recognizes the sequence 5'-GGTGANNNNNNNN^-3'.
This site can be added into the hairpin shown above by overlapping the end of the SspD5I site
with the N.BstNBI and MlyI sites:

                2,3
 ┌CCACTCAGNNNN Nv-5'
 └GGTGAGTCNNNN^N^-3'
              1 2,3

where the arrow "1" indicates the site of the nick made by N.BstNBI, and the arrow "2,3"
indicates the site on each "strand" that is cut by either MlyI or SspD5I.
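
Whether overlapping sites give coincident cut positions can be checked by scanning the 3' arm
of a candidate hairpin. The sketch below assumes the site notation used herein (recognition
sequence followed by "N" positions up to the cut) and concrete bases in place of "N"; the
function name `cut_positions` and the test sequence are illustrative.

```python
import re

# Minimal IUPAC table; only the symbols needed for these three sites.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

# Recognition sequence written up to the cut position, plus the kind of cut.
ENZYMES = {
    "N.BstNBI": ("GAGTCNNNN", "nick"),
    "MlyI": ("GAGTCNNNNN", "blunt"),
    "SspD5I": ("GGTGANNNNNNNN", "blunt"),
}


def cut_positions(strand):
    """Map each enzyme to (cut kind, list of cut indices) on a 5'->3' strand.

    A cut index i means the strand is cut between strand[i - 1] and strand[i].
    """
    result = {}
    for name, (site, kind) in ENZYMES.items():
        pattern = "".join(IUPAC[base] for base in site)
        # A lookahead is used so that overlapping occurrences are all found.
        positions = [m.start() + len(site)
                     for m in re.finditer(f"(?={pattern})", strand)]
        result[name] = (kind, positions)
    return result


# 3' arm of the hairpin: GGTGA overlapping GAGTC, then five spacer bases.
for enzyme, (kind, positions) in cut_positions("GGTGAGTCACGTA").items():
    print(enzyme, kind, positions)
```

For this arm the MlyI and SspD5I cuts fall at the same index, at the end of the arm, and the
N.BstNBI nick falls one base earlier.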

There is no requirement that the cleavage sites of one or more of the enzymes be in
common, and a number of different sites can be incorporated into the same sequence. For
instance, the following sequence

5'-GAGTC^NAC^C^D^-3'
        3   4 1 2

has a nicking site for N.BstNBI (restriction site GAGTCNNNN^) at the arrow "1", a cleavage
site for the blunt cutter MlyI (restriction site GAGTCNNNNN^) at arrow "2", a cleavage site
for the blunt cutter Hpy8I (restriction site GTN^NAC) at arrow "3", and a nicking site at
arrow "4" for N.CviPII (restriction site C^CD). Thus, a variety of restriction sites can be
designed into the hairpin or anchor.

The hairpin can also be designed to have an overhang, that is, one "strand" can be
longer than the other. This increases the number of possible restriction sites that can be
designed into the hairpin. For instance, the hairpin:

 ┌CTCAGNACCGGT-5'
 └GAGTCNTGG-3'

can have a nucleic acid template added to its 5' end:

 ┌CTCAGNACCGGTNNNN . . . -5'
 └GAGTCNTGG-3'.

Synthesis of the complementary strand will produce the following double-stranded nucleic
acid:

             2  3
 ┌CTCAGNACC GvGTvNNNN . . . -5'
 └GAGTCNTGG^C^CA^nnnn . . . -3'
           1 2  3

which can be nicked at position 1 by N.BstNBI, and is cleavable across both strands at
position 2 by MlyI, and at position 3 by BalI, another blunt cutter with restriction site
TGG^CCA. The single stranded template can be removed by use of N.BstNBI, or the original
hairpin can be recovered by using BalI, followed by N.BstNBI to recover the overhang.
Alternatively, a new type of blunt hairpin can be made by incorporating "CCA" onto the 3'
end of the hairpin to make it completely double-stranded.

Such overhangs can also be added to blunt hairpins by adding the overhang in the
same way one would add a single-stranded nucleic acid template. This can be used to
engineer a variety of restriction sites into the new hairpin. The actual template can then be
added to the new overhang.

All of the hairpins discussed above, and the methods for designing them, can
also be applied to double-stranded nucleic acid "anchors", to be attached to a
solid substrate, and to serve as an intermediate molecule anchoring the template to the solid
substrate.

All of the sequences described above have had restriction sites designed into the 5' to
3' strand of the hairpin/anchor, with the 5' end of the restriction site being closest to the
substrate or anchoring point. Alternatively, however, this can be reversed. If one wished to
use an enzyme that operates in the 3' to 5' direction, the sites can be designed into the other
"strand" of the hairpin or the other strand of the anchor.

The sites to be designed into the hairpins and anchors can be chosen for a variety of
reasons, including an enzyme's specificity or non-specificity, ease of use, longevity, etc.

Alternatively, one can use enzymes that cleave beyond the 5' end of their recognition
sites. Enzymes for use in this way can be those discovered in nature (i.e., naturally-occurring
enzymes), or can be created by mutation of existing enzymes. Such enzymes include, e.g.,
BcgI, BsaXI and BssKI. BssKI, for example, cleaves as follows:

5' . . . ^CCNGG . . . 3'
3' . . . GGNCC^ . . . 5'

A mutant of BssKI (or another enzyme) can be made which cleaves in only one strand. This
site can be included in a hairpin or anchor as described herein, where the hairpin or anchor
has non-cleavable phosphorothioate bonds on the 5' half of the hairpin, so that cleavage only
occurs in the 3' half of the hairpin, thereby creating a nick.

In another embodiment, the hairpin nucleic acid or double-stranded nucleic acid
anchor can be designed so that the portion to which the template nucleic acid is attached
contains non-cleavable bonds. That is, in the portion of the hairpin/anchor to which the
template nucleic acid is attached, the nucleotides are attached to each other by bonds which
are not cleavable by an endonuclease. In such a hairpin/anchor, an ordinary restriction
endonuclease can be used, but it will behave as a nicking endonuclease, and will cleave only
one strand - the one with the cleavable bonds between the nucleotides.

The non-cleavable bonds can be phosphorothioate bonds, which are easily added
during the synthesis of the hairpin/anchor. Any modification of the phosphodiester backbone
of the hairpin/anchor can be used, where the modification allows binding of the restriction
endonuclease to the hairpin/anchor, but prevents cleavage of the strand containing the
modifications.

For instance, AatII normally cleaves the following sequence:

5'... G-A-C-G-T^C ...3'
3'... C^T-G-C-A-G ...5'

However, if the normal bonds ("-") between the nucleotides at one of the cleavage
sites were replaced with bonds that are not cleavable ("=") by AatII, then the cleavage pattern
would resemble that of a nicking endonuclease:

5'... G-A-C-G-T=C ...3'
3'... C^T-G-C-A-G ...5'
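
The effect of a protected bond can be modelled in a few lines. This is a sketch under the
assumption that a strand is described by its sequence plus the set of bond positions that
resist cleavage; the class `Strand`, the function `cleave` and the bond-index convention are
hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    seq: str                                  # written 5'->3'
    protected_bonds: frozenset = frozenset()  # bond i joins seq[i] and seq[i + 1]


def cleave(strand, bond):
    """Return the fragments produced when an enzyme attacks the given bond."""
    if bond in strand.protected_bonds:
        return [strand.seq]                   # modified backbone: no cleavage
    return [strand.seq[: bond + 1], strand.seq[bond + 1:]]


AATII_BOND = 4  # GACGT^C: the bond between the T (index 4) and the final C

unmodified = Strand("GACGTC")
modified = Strand("GACGTC", frozenset({AATII_BOND}))

# The site is palindromic, so each strand reads GACGTC in its own 5'->3'
# direction and the enzyme attacks the same bond index on both strands.
print(cleave(unmodified, AATII_BOND))  # this strand is cut
print(cleave(modified, AATII_BOND))    # this strand stays intact
```

Only the unmodified strand is cut, so the duplex as a whole carries a nick rather than a
double-strand break.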

The use of endonucleases facilitates simple cleaving of the DNA at an exact position
in natural DNA bases. Therefore, no additional costs are incurred in constructing the
hairpin/anchor sequences. Furthermore, the use of an endonuclease guarantees that DNA
cleavage produces termini that are substrates for further manipulation by other enzymes such
as ligases or polymerases.

Regeneration of single-stranded DNA templates on a sequencing chip or nucleic acid
array produces a spatially addressable array where the sequence of DNA at every position on
the array is known. Such an array can be treated with a polymerase enzyme and
dNTPs to produce a double-stranded array that is also spatially addressable, enabling the
systematic analysis of DNA-protein interactions.

The density of the single molecule arrays is not critical. However, the present
invention can make use of a high density of hairpins/anchors, and these are preferable. For
example, arrays with a density of 10^-10^ hairpins/anchors per cm^2 may be used. Preferably,
the density is at least 10^/cm^2 and typically up to 10^/cm^2. These single molecule arrays are in
contrast to other arrays which may be described in the art as "high density" but which are not
necessarily as high and/or which do not allow single molecule resolution.

Using the methods and device of the present invention, it may be possible to image at
least 10^ - 10^, preferably 10^ or 10*, hairpins or anchors per cm^2. Fast sequential imaging
may be achieved using a scanning apparatus; shifting and transfer between images may allow
higher numbers of hairpins/anchors to be imaged.

The extent of separation between the individual hairpins/anchors on the array will be 
determined, in part, by the particular technique used to resolve the individual 
hairpins/anchors. Apparatus used to image molecular arrays are known to those skilled in the 
art. For example, a confocal scanning microscope may be used to scan the surface of the 
array with a laser to image directly a fluorophore incorporated on the individual 
hairpins/anchors by fluorescence. Alternatively, a sensitive 2-D detector, such as a charge- 
coupled device, can be used to provide a 2-D image representing the individual 
hairpins/anchors on the array. 

"Resolving" single hairpins/anchors (and their attached templates and complements)
on the array with a 2-D detector can be done if, at 100 x magnification, adjacent
hairpins/anchors are separated by a distance of at least approximately 250 nm, preferably at
least 300 nm and more preferably at least 350 nm. It will be appreciated that these distances
are dependent on magnification, and that other values can be determined accordingly, by one
of ordinary skill in the art.
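
The relationship between minimum separation and array density can be illustrated with a
short calculation. It assumes, purely for illustration, that the hairpins/anchors sit on an
ideal square lattice with exactly the minimum spacing; a randomly distributed array would
need a lower average density to keep most neighbours resolvable. The function name
`max_density_per_cm2` is hypothetical.

```python
def max_density_per_cm2(separation_nm):
    """Upper bound on density for a square lattice with the given spacing."""
    if separation_nm <= 0:
        raise ValueError("separation must be positive")
    separation_cm = separation_nm * 1e-7   # 1 nm = 1e-7 cm
    return 1.0 / separation_cm ** 2        # one molecule per lattice cell


for spacing in (250, 300, 350):
    print(f"{spacing} nm -> {max_density_per_cm2(spacing):.2e} per cm^2")
```

Under this idealised assumption a 250 nm spacing corresponds to about 1.6 x 10^9 positions
per cm^2, and the bound falls with the square of the spacing.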

Other techniques such as scanning near-field optical microscopy (SNOM) are
available which are capable of greater optical resolution, thereby permitting more dense
arrays to be used. For example, using SNOM, adjacent hairpins/anchors may be separated by
a distance of less than 100 nm, e.g., 10 nm. For a description of scanning near-field optical
microscopy, see Moyer et al., Laser Focus World (1993) 29(10).

An additional technique that may be used is surface-specific total internal reflection
fluorescence microscopy (TIRFM); see, for example, Vale et al., Nature (1996) 380:451-453.
Using this technique, it is possible to achieve wide-field imaging (up to 100 µm x 100 µm)
with single molecule sensitivity. This may allow arrays of greater than 10^ resolvable
hairpins/anchors per cm^2 to be used.

Additionally, the techniques of scanning tunnelling microscopy (Binnig et al.,
Helvetica Physica Acta (1982) 55:726-735) and atomic force microscopy (Hansma et al., Ann.
Rev. Biophys. Biomol. Struct. (1994) 23:115-139) are suitable for imaging the arrays of the
present invention. Other devices which do not rely on microscopy may also be used, provided
that they are capable of imaging within discrete areas on a solid support.

Immobilisation to the support may be by specific covalent or non-covalent
interactions. Covalent attachment is preferred. The immobilized hairpin/anchor is then able
to undergo interactions with other molecules or cognates at positions distant from the solid
support. Immobilisation in this manner results in well separated hairpins/anchors. The
advantage of this is that it prevents interaction between neighbouring hairpins/anchors on the
array, which may hinder interrogation of the array.

An array containing sequenced and regenerated templates can be used as an 
addressable platform for spatially organizing libraries of compounds attached to single 
stranded DNA tags. For example, a combinatorial library of drug compounds could be 
prepared with unique single stranded DNA tags or DNA mimics, e.g., PNA, and then added to 
a sequenced/regenerated array. This would generate a spatially addressable array of drug 
compounds on a chip. The same can be done for a protein library. Such chips could then be 
interrogated with probes to generate information about molecular interactions. 
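
The addressing principle described above can be sketched in a few lines of code. The sketch assumes that each sequenced/regenerated template is known by its array position and that a tagged compound is captured only by the template that is the exact reverse complement of its tag; the sequences, positions and compound names are hypothetical, and real hybridisation would also depend on tag length, mismatches and conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def assign_positions(array_templates: dict, tagged_library: dict) -> dict:
    """Map array positions to compounds.

    array_templates: position -> template sequence read from the array
    tagged_library:  compound name -> single-stranded DNA tag
    A compound is placed where the template is the exact reverse
    complement of its tag (a simplifying assumption).
    """
    position_by_sequence = {seq.upper(): pos
                            for pos, seq in array_templates.items()}
    placement = {}
    for compound, tag in tagged_library.items():
        pos = position_by_sequence.get(reverse_complement(tag))
        if pos is not None:
            placement[pos] = compound
    return placement

# Hypothetical data for illustration only.
templates = {(12, 40): "AAGGCT", (87, 5): "GATTACA"}
library = {"compound_A": "AGCCTT",
           "compound_B": "TGTAATC",
           "compound_C": "CCCCCC"}   # no matching template on the array

print(assign_positions(templates, library))
```

In this example compound_A and compound_B are assigned to positions (12, 40) and (87, 5) respectively, while compound_C finds no complementary template and is not placed.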

The arrays described herein are effectively single analyzable template nucleic acids. 
This has many important benefits for the study of the template sequences and their interaction 
with other biological molecules. In particular, fluorescence events occurring on each template 
nucleic acid can be detected using an optical microscope linked to a sensitive detector, 
resulting in a distinct signal for each template. 

When used in a multi-step analysis of a population of single templates, the phasing 
problems (loss of synchronisation) that are encountered using high density (multi-molecule) 
arrays of the prior art, can be reduced or removed. Therefore, the arrays also permit a 
massively parallel approach to monitoring fluorescent or other events on the templates. Such 
massively parallel data acquisition makes the arrays extremely useful in a wide range of 
analysis procedures which involve the screening/characterising of heterogeneous mixtures of 
templates. 
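
The phasing problem referred to above can be illustrated with a simple model. The sketch assumes that every molecule in a multi-molecule feature completes each step of a multi-step analysis independently with the same probability; the efficiency value and function name are hypothetical and chosen only for illustration.

```python
def in_phase_fraction(step_efficiency: float, cycles: int) -> float:
    """Fraction of molecules in a multi-molecule feature that have
    completed every one of `cycles` steps, assuming each step succeeds
    independently with probability `step_efficiency`."""
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("step_efficiency must be between 0 and 1")
    return step_efficiency ** cycles

# Assumed 99% per-step efficiency, for illustration only.
for n in (10, 25, 50, 100):
    print(f"after {n:>3} cycles: {in_phase_fraction(0.99, n):.3f} in phase")
```

Under this model, even a 99% step efficiency leaves only about 37% of the molecules in a feature synchronised after 100 cycles, so the signal from the feature becomes a blend of different positions. When each template is a single molecule read individually, a missed step affects only the record for that molecule and does not dilute the signal of its neighbours.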



Example. 

Twenty microliters of solution is prepared containing 50 pmoles of a DNA hairpin 
phosphorylated at its 5' end, 10 pmoles of a non-phosphorylated DNA double-stranded 
oligonucleotide, and several thousand units of a DNA ligase enzyme. The oligonucleotide is 
designed such that one strand is shorter than the other, making the oligonucleotide blunt- 
ended at one end and single stranded at the other, a 5' end. The single-stranded end carries a 
fluorescent label. The action of the ligase enzyme fuses the hairpin and the double-stranded 
oligonucleotide at their blunt ends only, and because only the 5' end of the hairpin carries a 
phosphate group, the reaction results in joining one strand to the hairpin - the longer strand 
that carries the fluorescent group. 
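
For convenience, the quantities stated for the ligation reaction can be converted into concentrations and molecule counts as follows. The sketch uses only the amounts given above (50 pmoles of hairpin and 10 pmoles of oligonucleotide in 20 microliters); the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def micromolar(pmol: float, volume_ul: float) -> float:
    """Concentration in µM; pmol per µL is numerically equal to µM."""
    return pmol / volume_ul

def molecules(pmol: float) -> float:
    """Number of molecules in the given number of picomoles."""
    return pmol * 1e-12 * AVOGADRO

VOLUME_UL = 20.0     # reaction volume stated in the Example
HAIRPIN_PMOL = 50.0  # 5'-phosphorylated hairpin
OLIGO_PMOL = 10.0    # labeled double-stranded oligonucleotide

print(f"hairpin: {micromolar(HAIRPIN_PMOL, VOLUME_UL):.2f} µM")
print(f"oligo:   {micromolar(OLIGO_PMOL, VOLUME_UL):.2f} µM")
print(f"hairpin:oligo molar ratio = {HAIRPIN_PMOL / OLIGO_PMOL:.0f}:1")
print(f"oligo molecules in reaction = {molecules(OLIGO_PMOL):.2e}")
```

The hairpin is therefore present at 2.5 µM and the oligonucleotide at 0.5 µM, a five-fold molar excess of hairpin over the labeled oligonucleotide.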

The template is regenerated by taking a solution containing 2.5 pmoles of a 
fluorescently labeled strand of DNA that has been previously ligated to a blunt DNA hairpin. 
The single-stranded portion of this DNA construct, i.e., the template strand, can be made 
double-stranded by employing 1 Unit of Vent (exo-) polymerase (New England Biolabs, Inc., 
Beverly, Massachusetts, USA) to incorporate a mixture of four oligonucleotides, each at a 
concentration of 25 pmoles per reaction, at 75°C for 30 minutes. Upon completion, the 
reaction mixture is purified using a DNA purification kit (Qiagen, Hilden, Germany) and split 
in two. Half is kept for analysis and half (1.25 pmoles) is digested at 55°C for 30 minutes 
with N.BstNBI (5 Units; New England Biolabs, Inc., Beverly, Massachusetts, USA), which 
nicks the extended DNA construct proximal to the new synthetic strand. The formation of the 
synthetic complementary strand by the polymerase enzyme and its removal by digestion with 
the nicking enzyme can be analyzed by polyacrylamide gel electrophoresis, which 
distinguishes the DNA products by virtue of their differences in size. The presence of the 
fluorescent group ensures that the DNA molecules can be easily detected. An identical 
experiment can be performed to demonstrate the regeneration of the blunt hairpin, except that 
the nicking enzyme N.BstNBI is substituted with the type IIs enzyme, MlyI. 
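
The size difference detected on the gel can be anticipated by locating the enzyme's cut position on a strand. The following sketch makes several assumptions that are not stated in this Example: that the recognition site is GAGTC, that the nicking enzyme cuts one strand 4 bases downstream of that site, and that a blunt-cutting type IIs enzyme can be modelled with an offset of 5. These values should be checked against the supplier's current data sheet. The sequence and function names are hypothetical.

```python
def cut_position(strand: str, site: str = "GAGTC", offset: int = 4):
    """Index at which the strand (written 5'->3') is cut: `offset` bases
    downstream of the first occurrence of `site`. Returns None if the
    site is absent or the cut would fall beyond the strand."""
    start = strand.upper().find(site)
    if start < 0:
        return None
    cut = start + len(site) + offset
    return cut if cut <= len(strand) else None

def nick(strand: str, site: str = "GAGTC", offset: int = 4) -> tuple:
    """Split the strand at the cut position; an uncut strand is
    returned as a one-element tuple."""
    cut = cut_position(strand, site, offset)
    if cut is None:
        return (strand,)
    return strand[:cut], strand[cut:]

# Hypothetical strand for illustration only.
strand = "TTGAGTCAAAACCCGG"
for label, offset in (("nicking enzyme (offset 4, assumed)", 4),
                      ("type IIs enzyme (offset 5, assumed)", 5)):
    fragments = nick(strand, offset=offset)
    print(label, [(f, len(f)) for f in fragments])
```

The fragment lengths returned in this way indicate which bands would be expected to shift on the gel after digestion.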

This procedure can also be performed with little modification in a flow-cell where the 
substrate comprises DNA ligated to DNA hairpins that are covalently attached to the glass 
surface of the flow cell. In this case, the attachment of the DNA to a solid support, the glass, 
obviates the need to employ a DNA purification kit between enzyme steps: instead, products 
can be removed and new reagents added by flowing solutions through the cell. 

All patents, patent applications, and published references cited herein are hereby 
incorporated by reference in their entirety. While this invention has been particularly shown 
and described with references to preferred embodiments thereof, it will be understood by 
those skilled in the art that various changes in form and details may be made therein without 
departing from the scope of the invention encompassed by the appended claims. 